SLIP

SLIP Technology Browser Exercise I

November 15, 2001

Obtaining Informational Transparency with Selective Attention

Dr. Paul S. Prueitt

President, OntologyStream Inc

{Eventname, d_port}

One needs two WinZip files, vSLIP and wSLIP.

An analytic conjecture was developed that linked defender ports with a non-specific relationship. A RealSecure summary intrusion event database was used. We used 14,475 records from April 15, 2001. The RealSecure columns are:

{ record, ename, protocol, s_port, d_port, s_addname, d_addname, epriority }

s_addname is the IP address of the source and d_addname is the IP of the defender.

Let us review what an analytic conjecture is.

Figure 1: The simplest form of an analytic conjecture

Formally we have:

( a₁ , b ) + ( a₂ , b ) → < a₁ , r , a₂ >

where r is the non-specific relationship. The “b” values are from one column in the intrusion event log and the “a” values are from a second column in the same log. We call “b” the “first name” and “a” the “second name”. The set { a } defines the set of atoms that are categorized. The set { b } provides a means to define the incident-level events that result from the emergent computing technique that we invented. Incident event descriptions can be constructed automatically using emergent computing and a reordering of the values of a derived Report. This fact was discovered on November 15th, 2001 (see the following example).

The Example:

Let the first column, (b), be the RealSecure Intrusion Event designation (ename) and the second column, (a), be the d_port. The unique elements of the second column of the RealSecure events will be a superset of the set of atoms for this analytic conjecture. There are 602 atoms in the top-level category A1 of the associated SLIP Framework.

There are 49 unique ename values in the event log. These are the b values that can be used in the development of event maps.

There are 47,780 pairs of d_port values; each member of a pair is a port value. The pairs are defined through the analytic conjecture graph, Figure 1.
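The pairing rule can be sketched in a few lines of Python. This is a minimal illustration, not the SLIP implementation: the function name `conjecture_pairs` and the toy records are assumptions, and the text does not say whether SLIP counts ordered, unordered, or repeated pairs to arrive at 47,780, so the sketch simply keeps each unordered pair once.

```python
from collections import defaultdict
from itertools import combinations


def conjecture_pairs(records):
    """Return unordered (a1, a2) pairs of atoms that share at least one b value.

    records: iterable of (b, a) tuples, for example (ename, d_port).
    """
    atoms_by_b = defaultdict(set)
    for b, a in records:
        atoms_by_b[b].add(a)

    pairs = set()
    for atoms in atoms_by_b.values():
        # Any two distinct atoms seen with the same b are related by r.
        for a1, a2 in combinations(sorted(atoms), 2):
            pairs.add((a1, a2))
    return pairs


# Hypothetical toy records (ename, d_port); not taken from the RealSecure log.
toy_records = [
    ("Event_X", 80),
    ("Event_X", 443),
    ("Event_Y", 443),
    ("Event_Y", 8080),
    ("Event_Z", 25),
]
toy_pairs = conjecture_pairs(toy_records)
print(sorted(toy_pairs))
```

In the toy data, port 25 appears with only one event name and no other port shares that name, so it takes part in no pair.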

Start SLIP.exe in a folder that contains a subfolder named “data”. You need only two text files to start with:

1) Paired.txt is the file containing the 47,780 pairs of port values.

2) Datawh.txt (Data Warehouse) is the file containing the 14,475 RealSecure summary events records.

These two files are memory mapped and then searched using new algorithms invented for this purpose. Paired.txt is searched several hundred thousand times to produce the clustering. Datawh.txt is searched up to a few hundred times in order to produce a report from a cluster of atoms. The report contains every intrusion event log record that has a value equal to one of the atoms in the category. When the Report is ordered by first name, the structure of incident-level events is revealed.
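The report step described above amounts to a filter followed by a sort. The sketch below assumes the records have already been parsed into dictionaries keyed by the RealSecure column names; the function `build_report` and the toy rows are hypothetical, and the sketch does not reproduce the memory-mapped search that SLIP uses.

```python
def build_report(records, cluster_atoms, atom_column="d_port", first_name="ename"):
    """Select the records whose atom value is in the cluster, ordered by first name.

    records: list of dicts with at least the keys "record", atom_column, first_name.
    cluster_atoms: the atoms (here d_port values) of one category.
    """
    atoms = set(cluster_atoms)
    hits = [rec for rec in records if rec[atom_column] in atoms]
    # Order by first name (ename), then by record number within each name.
    return sorted(hits, key=lambda rec: (rec[first_name], rec["record"]))


# Hypothetical toy rows; only three of the eight RealSecure columns are shown.
toy_log = [
    {"record": 1, "ename": "Event_Y", "d_port": 443},
    {"record": 2, "ename": "Event_X", "d_port": 80},
    {"record": 3, "ename": "Event_Z", "d_port": 25},
    {"record": 4, "ename": "Event_X", "d_port": 443},
]
toy_report = build_report(toy_log, {80, 443})
for row in toy_report:
    print(row["record"], row["ename"], row["d_port"])
```

Sorting on the first name groups the records of each event name together, which is the ordering the text says reveals the incident-level structure.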

Clearly the first question is whether this retrieval is the right retrieval. There are two factors here. One is the order in which the records are shown, and the other is the examination of the non-specific relationship that defined the clustering.

Exercise:

Take the WinZip file called vSLIP and unzip it into any empty directory. Click on SLIP.exe and, when the browser comes up, type “extract” in the command line. When the hourglass goes away, click on the A1 node. Type “mag 10” in the command line. This will give you Figure 1a.

Figure 1 (panels a and b): First step of the exercise

Typing “c 100” (cluster for 100,000 iterations) will produce something like Figure 1b.

Selecting A1 gives the view in Figure 2a. Selecting B1 gives the view in Figure 2b.

Figure 2 (panels a and b): Drilling down into the data

In Figure 2a we have identified a cluster of 47 elements, at magnification 10. These are d_ports with a non-specific relationship defined by the fact that a single event name is associated in the original database with two or more d_ports. The returned report has 383 records; the first few can be seen in the Report window in Figure 3.

It is interesting to note that this cluster is diffuse and not narrowly defined. However, the reader can use the software to scatter-gather the atoms of this cluster, considered by itself, and see that all of the atoms immediately go to a single spike. In Figure 2b the magnification is set at 20. Type “help” to see the necessary command.

Figure 3. Category B1

Starting with Figure 1, one can click on Plot and use the “rnd” command to randomize the atoms.

Use the command “x”, where x is any number between 0 and 360, to create an indicator line (Figure 4a). Use the command “x, y”, where x and y are any numbers between 0 and 360, to create an indicator bracket. (One cannot go across 0 in the current version.) Use “x, y -> Tag”, where Tag is any short name, to create a new category (Figure 4b).

Figure 4 (panels a and b). The use of the indicator lines and brackets
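The effect of an indicator bracket can be pictured as an angular range test. The sketch below is an assumption about the behavior, not the Browser's code: it supposes each atom has a position in degrees on the circle, and the function `select_bracket`, the positions, and the tag name are all hypothetical.

```python
def select_bracket(positions, x, y):
    """Return the atoms whose angle lies inside the bracket [x, y].

    positions: dict mapping atom -> angle in degrees, 0 <= angle < 360.
    The bracket may not wrap across 0, matching the restriction noted above.
    """
    if not (0 <= x <= y <= 360):
        raise ValueError("bracket must satisfy 0 <= x <= y <= 360")
    return {atom for atom, angle in positions.items() if x <= angle <= y}


# Hypothetical atom positions after clustering (port -> angle in degrees).
toy_positions = {80: 12.0, 443: 95.5, 8080: 101.0, 25: 250.0}

# Rough equivalent of typing "90, 110 -> T1": the bracketed atoms become a category.
toy_categories = {"T1": select_bracket(toy_positions, 90, 110)}
print(toy_categories)
```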

Clustering (“c”) will show that B1, C1, C2, and C3 are each indeed a prime, and that each element has at least one other element in the category to which it is related by the non-specific relationship. Randomizing A1 will allow one to reselect a cluster. By inspection, one can see that any cluster selected will be one of the four already identified. One can delete nodes by physically removing the folders in the Data folder and typing “load”. The Browser reads the folder structure, much as .NET and Java programs do, to find what is available for display.
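The property stated for a prime, that every element has at least one related partner inside the same category, can be checked directly against the pair list. The sketch below tests only that stated property and is not offered as a full definition of a prime; the function name and the toy pairs are assumptions.

```python
def every_atom_has_partner(category, pairs):
    """Check that each atom in the category is paired with another atom in it.

    category: iterable of atoms.
    pairs: iterable of (a1, a2) tuples related by the non-specific relationship.
    """
    category = set(category)
    partnered = set()
    for a1, a2 in pairs:
        # Count a pair only if both ends are distinct members of the category.
        if a1 != a2 and a1 in category and a2 in category:
            partnered.update((a1, a2))
    return partnered == category


# Hypothetical pairs of d_port values.
toy_pair_list = [(80, 443), (443, 8080)]
print(every_atom_has_partner({80, 443, 8080}, toy_pair_list))
print(every_atom_has_partner({80, 8080}, toy_pair_list))
```

The second call fails the check because 80 and 8080 are linked only through 443, which is not in that category.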

For B1 only, the data folders contain the report that is currently generated by the Browser, and also two other text files called Report2.txt and Report3.txt in the B3 folder. One can get the effect of ordering the report by the event name or the d_port by renaming the files. The default ordering is the report number. The manipulation of the reports is still under development.

The files to view these structures (Figure 2 and Figure 3) are compressed into wSLIP. You may call Paul Prueitt at 703-981-2676 with questions. Additional investigation of clusters and reports can be made as practice. Comments about the interface should be sent to beadmaster@ontologyStream.com, to help in the continuing design.

One can take the residue, at the B level, by clicking on A1 and typing “residue”. The new subset (of 542 atoms) will cluster very much as in Figure 5. These prime categories have all of the features of a single event, as characterized by the chain relationships that are causing the clustering. The small size is just to get an example that is easy to look at. In this example this is not possible as the prime categories are easily identified. The current version does not generate reports if the prime has more than 50 atoms. As of November 14th, this has been fixed but is not yet released.

Figure 5: There are four prime events when viewed with this Analytic Conjecture.

The zipped file wSLIP will open up to the view of data as in Figures 5 and 6. However, the only report is for B1. The next exercise will demonstrate a functioning Report generator where the Report is an object and can be viewed in two different orderings (temporally and by first name order, as in Figure 5). An incident-level event map will also be automatically constructed (see Figures 6 and 7).

The ordering of the Report by the Firstname (in this case ename) shows exactly why the entire category B1 is a prime. One can see this visually using the software, by clicking on B1, typing “rnd” to randomize and “c” to cluster.

Figure 6: The top and bottom of the Report ordered by the Analytic Conjecture

In Figure 7 we see three groups of event names. The three event names have become well specified by the link analysis (Figure 1).

Figure 7: The event map for the category B1

The ordering by the first name (the b values) from the analytic conjecture produces a visual representation of the specific chaining relationships, each governed by a single b value (an event name in this case). This suggests a natural way to specify an event map for each of the prime categories that are found through the clustering process.

Figure 8: A graphic depicting the construction and display of the event map

In Figure 6 we have mocked up how the event maps might look once we put the algorithms in place. All of the Stream_DoS RealSecure intrusion events (there are four of them) are linked together via two non-specific relationships:

{ < 1169, r, 1157 >, < 1157, r, 1614 > }

as can be seen in Figure 5. We then have the relationship < 1614, r, x > with x having 31 distinct values. That links the Stream_DoS intrusion events to 131 TCP_Overlap_Data intrusion events. These intrusion events are then linked to the 6 Windows_Access_Error intrusion events via common use of the ports 1781 and 1853.

The Incident events are then:

{ < Stream_DoS, 1614, TCP_Overlap_Data >,
  < TCP_Overlap_Data, 1781, Windows_Access_Error >,
  < TCP_Overlap_Data, 1614, Windows_Access_Error > }
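Triples of this shape can be derived by turning the pairing around: two event names are linked whenever they share a port. The sketch below is one possible reading, not the event-map algorithm itself, which the text says was not yet in place. The function `incident_links` is an assumption, and the toy rows are hypothetical stand-ins shaped after the links described above, not rows copied from the log.

```python
from collections import defaultdict
from itertools import combinations


def incident_links(records):
    """Return (ename1, port, ename2) triples for event names that share a port.

    records: iterable of (ename, d_port) tuples.
    """
    enames_by_port = defaultdict(set)
    for ename, port in records:
        enames_by_port[port].add(ename)

    links = set()
    for port, enames in enames_by_port.items():
        # Each pair of distinct event names seen on the same port is linked by it.
        for e1, e2 in combinations(sorted(enames), 2):
            links.add((e1, port, e2))
    return links


# Hypothetical toy rows (ename, d_port), for illustration only.
toy_rows = [
    ("Stream_DoS", 1169),
    ("Stream_DoS", 1614),
    ("TCP_Overlap_Data", 1614),
    ("TCP_Overlap_Data", 1781),
    ("Windows_Access_Error", 1781),
    ("TCP_Overlap_Data", 1853),
    ("Windows_Access_Error", 1853),
]
toy_links = incident_links(toy_rows)
for link in sorted(toy_links):
    print(link)
```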

Other aspects of the event corresponding to category B1 can be developed by looking at the other data such as common IP address usage.

These incident-level events are the primary work product produced automatically by SLIP.

Intrusion events are at one level of organization, and the incident events self-organize from them through link analysis and emergent computing.